In this paper, a semantic communication framework for image transmission is developed. In the investigated framework, a set of servers cooperatively transmit images to a set of users utilizing semantic communication techniques. To evaluate the performance of the studied semantic communication system, a multimodal metric (ISS) is proposed to measure the correlation between the extracted semantic information and the original image. To meet the ISS requirement of each user, each server must jointly determine the semantic information to be transmitted and the resource blocks (RBs) used for semantic information transmission. We formulate this as an optimization problem that aims to minimize each server's transmission latency while meeting the ISS requirement. To solve this problem, a value-decomposition-based entropy-maximized multi-agent reinforcement learning (RL) algorithm is proposed, which enables servers to coordinate during training and execute RB allocation in a distributed manner, approaching globally optimal performance with fewer training iterations. Compared to traditional multi-agent RL, the proposed RL algorithm improves the servers' exploration of valuable actions and the probability of finding a globally optimal RB allocation policy based on local observations. Simulation results show that the proposed algorithm can reduce the transmission delay by up to 16.1% compared to traditional multi-agent RL.
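Environmental perception based on multi-modal fusion of radar and camera is crucial in autonomous driving for improving accuracy, completeness, and robustness. This paper focuses on how to use millimeter-wave (MMW) radar and camera sensor fusion for 3D object detection. A novel method is proposed that performs feature-level fusion under the bird's-eye view (BEV) to obtain a better feature representation. First, radar features are augmented through temporal accumulation and sent to a temporal-spatial encoder for radar feature extraction. Meanwhile, multi-scale 2D image features that adapt to various spatial scales are obtained by an image backbone and neck model. Then, the image features are transformed to BEV with the designed view transformer. In addition, this work fuses the multi-modal features with a two-stage fusion model consisting of point fusion and ROI fusion. Finally, a detection head regresses object categories and 3D locations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance under the most important detection metrics, mean average precision (mAP) and nuScenes detection score (NDS).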
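For readers unfamiliar with the NDS metric named in the abstract above, the following minimal sketch shows how it is commonly computed in the public nuScenes benchmark definition: a weighted combination of mAP and five true-positive error terms, each clipped to at most 1. The function name `nuscenes_detection_score` and the dictionary layout are illustrative assumptions, not code from the paper.

```python
def nuscenes_detection_score(mean_ap, tp_errors):
    """Combine mAP and true-positive errors into a single score.

    Assumed form (public nuScenes benchmark definition):
        NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10

    mean_ap:   mean average precision in [0, 1].
    tp_errors: dict with the five mean TP errors (translation, scale,
               orientation, velocity, attribute); keys are hypothetical.
    """
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five true-positive error terms")
    # Each error is clipped at 1 and converted into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    errors = {"mATE": 0.5, "mASE": 0.5, "mAOE": 0.5, "mAVE": 0.5, "mAAE": 0.5}
    print(nuscenes_detection_score(0.5, errors))
```

Because mAP carries half of the total weight, a detector cannot reach a high NDS through low localization error alone.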
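In this paper, a semantic communication framework for text data transmission is proposed. In the studied model, a base station (BS) extracts semantic information from text data and transmits it to each user. The semantic information is modeled by a knowledge graph (KG) that consists of a set of semantic triples. After receiving the semantic information, each user recovers the original text using a graph-to-text generation model. To measure the performance of the considered semantic communication framework, a metric of semantic similarity (MSS) that jointly captures the semantic accuracy and completeness of the recovered text is proposed. Due to wireless resource limitations, the BS may not be able to transmit the entire semantic information to each user while satisfying the transmission delay constraint. Hence, the BS must select an appropriate resource block for each user, and determine and transmit only part of the semantic information to the users. We therefore formulate an optimization problem whose goal is to maximize the total MSS by jointly optimizing the resource allocation policy and determining the partial semantic information to be transmitted. To solve this problem, a proximal-policy-optimization-based reinforcement learning (RL) algorithm integrated with an attention network is proposed. The proposed algorithm uses the attention network to evaluate the importance of each triple in the semantic information and then builds a relationship between the importance distribution of the triples and the total MSS. Compared to traditional RL algorithms, the proposed algorithm can dynamically adjust its learning rate, thus ensuring convergence to a locally optimal solution.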
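In addition to maximizing total revenue, decision-makers in many industries would like to guarantee fair consumption across different resources and avoid saturating certain resources. Motivated by these practical needs, this paper studies the price-based network revenue management (NRM) problem with both demand learning and fairness concerns about the consumption across different resources. We introduce the regularized revenue, i.e., the total revenue with a fairness regularization, as our objective in order to incorporate fairness into the revenue maximization goal. We propose a primal-dual-type online policy with an upper-confidence-bound (UCB) demand learning method to maximize the regularized revenue. We adopt several innovative techniques to make our algorithm a unified and computationally efficient framework for continuous price sets and a wide class of fairness regularizers. Our algorithm achieves a worst-case regret of $\tilde{O}(n^{5/2}\sqrt{t})$, where $n$ denotes the number of products and $t$ denotes the number of time periods. Numerical experiments on several NRM examples demonstrate the effectiveness of our algorithm in balancing revenue and fairness.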
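Recently, the Transformer has become a prevailing deep architecture for solving vehicle routing problems (VRPs). However, it is less effective in learning improvement models for VRPs because its positional encoding (PE) method is not suitable for representing VRP solutions. This paper presents a novel Dual-Aspect Collaborative Transformer (DACT) that learns embeddings for the node and positional features separately, instead of fusing them together as existing approaches do, so as to avoid potential noise and incompatible correlations. Moreover, the positional features are embedded through a novel cyclic positional encoding (CPE) method that allows the Transformer to effectively capture the circularity and symmetry of VRP solutions (i.e., cyclic sequences). We train DACT using proximal policy optimization and design a curriculum learning strategy to improve sample efficiency. We apply DACT to the traveling salesman problem (TSP) and the capacitated vehicle routing problem (CVRP). Results show that DACT outperforms existing Transformer-based improvement models and exhibits better generalization across different problem sizes on both synthetic and benchmark instances.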
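To illustrate why a positional encoding for cyclic sequences differs from a standard one, the sketch below maps a position on a tour of length `cycle_length` to sine and cosine harmonics of its angle on a circle. This is a generic construction chosen for illustration and is not the CPE method of the paper; the function name and parameters are assumptions.

```python
import math


def cyclic_encoding(position, cycle_length, dim):
    """Encode a position on a cycle as harmonics of its angle.

    Illustrative stand-in, not the paper's CPE: the position is mapped to
    the angle 2*pi*position/cycle_length, then to sin/cos of integer
    multiples of that angle. `dim` is assumed to be even.
    """
    angle = 2.0 * math.pi * position / cycle_length
    encoding = []
    for k in range(1, dim // 2 + 1):
        encoding.append(math.sin(k * angle))
        encoding.append(math.cos(k * angle))
    return encoding


def distance(u, v):
    """Euclidean distance between two encodings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    n, d = 10, 8
    start = cyclic_encoding(0, n, d)
    # Position n wraps around to position 0 on the cycle.
    print(distance(start, cyclic_encoding(n, n, d)))
    # The successor and the predecessor of position 0 are equally far away.
    print(distance(start, cyclic_encoding(1, n, d)))
    print(distance(start, cyclic_encoding(n - 1, n, d)))
```

With an absolute (non-cyclic) encoding, the first and last nodes of a tour would appear far apart even though they are adjacent in the solution; the circular construction removes that artifact.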
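In this paper, we investigate the problem of learning in contextual search, which is motivated by applications such as first-price auctions, personalized medicine experimentation, and feature-based pricing experiments. In particular, for a sequence of arriving context vectors, each associated with an underlying value, the decision-maker either makes a query at a certain point or skips the context. The decision-maker observes only binary feedback on the relationship between the query point and the value associated with the context. We study a PAC learning setting in which the goal is to learn the underlying mean value function with a minimum number of queries. To address this challenge, we propose a tri-section search approach combined with a margin-based active learning method. We show that the algorithm needs to make only $O(1/\varepsilon^2)$ queries to achieve $\varepsilon$-estimation accuracy. This is a significant reduction from the sample complexity required in the passive setting, which is at least $\Omega(1/\varepsilon^4)$.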
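Domain generalization aims to perform well on unseen environments using data from a limited number of training environments. Despite a proliferation of proposed algorithms for this task, assessing their performance both theoretically and empirically remains very challenging. Distribution-matching algorithms such as (conditional) domain adversarial networks [Ganin et al., 2016, Long et al., 2018] are popular and enjoy empirical success, but they lack formal guarantees. Other approaches such as invariant risk minimization (IRM) require a prohibitively large number of training environments, scaling with the dimension of the spurious feature space $d_s$, even on simple data models like the one proposed by [Rosenfeld et al., 2021]. Under a variant of this model, we show that neither ERM nor IRM can generalize with $o(d_s)$ environments. We then present an iterative feature matching algorithm that is guaranteed, with high probability, to yield a predictor that generalizes after seeing only $O(\log d_s)$ environments. Our results provide the first theoretical justification for a family of widely used distribution-matching algorithms under a concrete nontrivial data model.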
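We consider the dynamic assortment optimization problem under the multinomial logit (MNL) model with unknown utility parameters. The main question investigated in this paper is model misspecification under the $\varepsilon$-contamination model, which is a fundamental model in robust statistics and machine learning. In particular, over a selling horizon of length $t$, we assume that customers make purchases according to a well-specified underlying multinomial logit choice model in a $(1-\varepsilon)$-fraction of the time periods and make arbitrary purchasing decisions in the remaining $\varepsilon$-fraction of the time periods. In this model, we develop a new robust online assortment optimization policy via an active elimination strategy. We establish both upper and lower bounds on the regret, and show that our policy is optimal up to a logarithmic factor in $t$ when the assortment capacity is bounded by a constant. We further develop a fully adaptive policy that does not require any prior knowledge of the contamination parameter $\varepsilon$. When a suboptimality gap exists between optimal and suboptimal products, we also establish gap-dependent logarithmic regret upper bounds in both the known-$\varepsilon$ and unknown-$\varepsilon$ cases. Our simulation study shows that our policy outperforms existing policies based on upper confidence bounds (UCB) and Thompson sampling.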
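As a concrete picture of the choice model described above, the following minimal sketch computes standard MNL purchase probabilities, $P(i \mid S) = v_i / (1 + \sum_{j \in S} v_j)$ with the no-purchase weight normalized to 1, and simulates a single purchase under contamination. Two simplifications are assumptions of this sketch rather than part of the paper: each period is contaminated independently with probability `eps` (the abstract only fixes the fraction of contaminated periods), and the arbitrary decision is replaced by a uniformly random one. All names are hypothetical.

```python
import random


def mnl_choice_probs(pref, assortment):
    """Standard MNL purchase probabilities for an offered assortment.

    pref:       dict mapping product id -> preference weight v_i > 0.
    assortment: iterable of offered product ids.
    The key None in the result stands for the no-purchase option (weight 1).
    """
    denom = 1.0 + sum(pref[j] for j in assortment)
    probs = {i: pref[i] / denom for i in assortment}
    probs[None] = 1.0 / denom
    return probs


def simulate_purchase(pref, assortment, eps, rng):
    """Draw one purchase decision under a simplified contamination model."""
    options = list(assortment) + [None]
    if rng.random() < eps:
        # Contaminated period: a uniform draw stands in for an arbitrary
        # (possibly adversarial) decision. This is an assumption.
        return rng.choice(options)
    probs = mnl_choice_probs(pref, assortment)
    return rng.choices(options, weights=[probs[o] for o in options], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    pref = {"a": 1.0, "b": 2.0}  # made-up preference weights
    print(mnl_choice_probs(pref, ["a", "b"]))
    print([simulate_purchase(pref, ["a", "b"], 0.1, rng) for _ in range(5)])
```

A policy that estimates the weights from such purchases has to tolerate the contaminated draws, which is the difficulty the robust policy in the abstract is designed to handle.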
Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, given the limited computation and communication capabilities of edge networks, a major challenge in developing decentralized bilevel optimization techniques is to lower sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization algorithm called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient-tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than following the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, the DIAMOND algorithm does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that the DIAMOND algorithm achieves $\mathcal{O}(\epsilon^{-3/2})$ sample and communication complexities for reaching an $\epsilon$-stationary solution, both of which are independent of the dataset sizes and significantly improve on existing works. Extensive experiments also verify our theoretical findings.
In the field of antibody engineering, an essential task is to design a novel antibody whose paratopes bind to a specific antigen with correct epitopes. Understanding antibody structure and its paratope can facilitate a mechanistic understanding of its function. Therefore, predicting antibody structure from sequence alone has always been a highly valuable problem for de novo antibody design. AlphaFold2, a breakthrough in the field of structural biology, provides a solution for predicting protein structure based on protein sequences and computationally expensive coevolutionary multiple sequence alignments (MSAs). However, its computational cost and unsatisfactory prediction accuracy on antibodies, especially on the complementarity-determining regions (CDRs), limit its application in industrial high-throughput drug design. To learn an informative representation of antibodies, we trained a deep antibody language model (ALM), based on a transformer, on curated sequences from the Observed Antibody Space database. We also developed a novel model named xTrimoABFold to predict antibody structure from antibody sequence based on the pretrained ALM as well as efficient evoformers and structural modules. The model was trained end-to-end on the antibody structures in PDB by minimizing an ensemble loss that combines a domain-specific focal loss on the CDRs with the frame-aligned point loss. xTrimoABFold outperforms AlphaFold2 and other protein-language-model-based SOTAs, e.g., OmegaFold, HelixFold-Single, and IgFold, by a large margin (30+\% improvement in RMSD) while running 151 times faster than AlphaFold2. To the best of our knowledge, xTrimoABFold achieves state-of-the-art antibody structure prediction. Its improvement in both accuracy and efficiency makes it a valuable tool for de novo antibody design and could enable further advances in immuno-theory.